Proteomic waves in networks of transcriptional regulators 
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A chain of connected genes with activation-repression links is analysed. It is shown that for 
various promoter activity functions (parametrised by Hill coefficient) the equations describing the 
concentrations of transcription factors are perturbed completely integrable differential-difference of 
KdV-type. In the case of large Hill coefficient the proteomic signal along the gene network is given 
by a superposition of perturbed dark solitons of defocusing differential-difference mKdV equation. 
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Genetically precoded responses of bacteria (and all liv- 
ing organisms) to external perturbations and signals is 
achieved through networks of genes of high complexity. 
The interaction of genes aims at regulating each others 
activity and thus leads to the specialised response [H. 
Typically a gene is subject to the regulatory effect of 
other genes which can act on it in either an activating 
or a suppressing way, depending on the situation. The 
predominant topic of many experimental and theoretical 
studies on genetic circuits so far has been the combina- 
torial control of transcription, which, to a large extent 
determines the connectivity of the network Q . It is thus 
very important to study and understand the dynamics of 
the gene regulatory networks. 

This also will allow to design specific gene regulatory 
networks performing certain functions. The problem of 
designing is quite complicated due to the poorly under- 
standing of the "design principles". Up to now a large 
amount of studies have been performed in vitro and in 
vivo to understand the molecular mechanisms. As an out- 
come, at least at the bacterial level, it has been shown a 
hierarchical organization in motifs, modules and games 
Q - so we have a rather modular than a molecular or- 
ganisation. 

For example these networks display bistability, oscil- 
latory behaviour of some combinations of repressors and 
the level of description similar to electronic engineering 
blueprints seems to be quite appropriate [3] • 

In this paper we are going to show that under very sim- 
ple assumptions a chain of transcriptional regulators hav- 
ing inducer-repressor design can display complex and in 
the same time controlable dynamics. Namely the "gene 
expression" dynamics of the gene can "propagate" to the 
other genes in a "solitary wave" manner and as a read- 
out the concentrations of transcription factor proteins 
can have "solitonic" dependence with respect to the la- 
bel of genes. More precisely the proteic concentrations 
will obey, for various values of Hill coefficient in the 
promoter activity, differential-difference (semidiscrete) 
Korteweg de Vries (KdV) equation-types perturbed, of 
course, with terms containing degradation rates. We will 
discuss the non-Hill case as well. Because in a previous 



paper we analysed a general chain of genes having only 
activator-activator or repressor-repressor interaction and 
found conditions for the existence of bistable states [B| 
it is natural to analyse also the alternating gene net- 
works . The activity of a gene is regulated by other genes 
through the production of transcription factor (TF) pro- 
teins. Physically, this is accomplished through the in- 
teraction of these transcription factor proteins with the 
RNA polymerase complex in the regulatory region of the 
gene. At the bacterial level the mechanism is the fol- 
lowing. The code segment of the DNA chain is read by 
RNA polymerase complex (RNAp) thus producing the 
RNA messenger acid (mRNA). This one goes to the ri- 
bozomal machine which produces the protein according 
to the codon distribution in the gene segment. This pro- 
tein will be at its turn the transcription factor for a new 
transcriptional process. In order to build a mathemat- 
ical model of this process, one must first describe the 
binding of the RNA polymerase molecule to the DNA 
promoter, namely a region which is the beginning of the 
encoded string. In a thermodynamical description 
the promoter activity is proportional to the equilibrium 
probability g of the binding of the RNA polymerase to 
the core promoter. In the case of the simplest processes, 
namely simple activation or suppression, the dependence 
of g on the cellular TF concentrations (which we shall 
denote by p) is described by the Arrhenius form [3] 
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for suppression where to a = exp(— AGa-p/ RT) is the 
Boltzmann weight of the activator-RNAp interaction, 
Ka dissociation constant between the protein and re- 
spective operator sequence in the regulatory region and 
A is the effect of promoter leakage in repression. Also for 
the ribozomal activity the function describing this will 
have a linear form as 

f(rn) = vm — /i 
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where v is the protein synthesis rate at full activation 
which can have a large span of values (0-lOOnM/min) 
and n is related to the fact that a certain amount of 
mRNA is not coded [I]. 

From these basic ingredients, we can write the dy- 
namical equations for one gene. Two steps can be dis- 
tinguished. First the RNA polymerase produces RNA- 
messenger acid m 

dm/dt = gA,Ft(p) - A m m 

where A" 1 is the mRNA half-life which is around 5 min. 
Next, the RNA-messenger acid goes to a ribosomal ma- 
chine and TF proteins are produced according to the 
equation: 

dp/ dt = f(m) — 7p = vm — fj, — 7P 

where 7 -1 is the protein half-life which is ten times higher 
than mRNA one. 

Since the kinetics of RNA messenger production are 
rapid compared to those of the TF proteins, it is not 
unreasonable to make a steady-state assumption for the 
reaction leading to their production, and thus we have 
m = g{p)/\ m . In this case the equation for the tran- 
scription factor production associated to a single gene is 
given by: 
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In a more general context a detailed analysis of the in- 
teraction of RNA polymerase with the DNA chain shows 
that the binding probability has a more general form as 
a "stiffer" sigmoid namely, 
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where a and (3 are strictly positive numbers (for instance 
(3 /a in the activator case is proportional to the Boltz- 
mann weight uja and p is normalised by dividing it to 
the dissociation constant p —> p/ Ka)- Activation occurs 
for a < (3 whilst repression for a > [3. Experimentally 
in the activator case 10 <J3/a < 100 and in the repres- 
sor case is slighly bigger [1(. The exponent a is the Hill 



coefficient which can take only nonegative values. Bio- 
logically it represents the cooperativity in the promoter 
activity and is related to the number of operator-bound 
transcription factors interacting with RNAp. 

In this paper we shall consider a chain of genes where 
each gene is in interaction with two others, the effect of 
which can be either activating or suppressing. From the 
one-gene model we presented above, we can generalise 
the equation (p]) for any gene labeled with n write as: 



= 9A(Pn+l) + 9r(Pti-i) - l(n)p n 



(3) 

where g(p n ) is given by equation ^ assuming of course 
that all genes have the same promoter activity and ne- 
glect the self-activation of the gene n. The above equa- 
tion can be in some sense misleading because we sim- 
ply add the functions gA{jPn+i) and gR(p n -i)- ln reality 
the problem is much more intricate since the possible 
interaction between the transcription factors p n +i and 
Pn-i can modify drastically the promoter activity func- 
tion Q and a more elaborate model must take it into 
account . These interaction may appear not only be- 
cause of the overlapping between them but also from the 
possible DNA looping [9(. In our model we assume that 
there is no interaction between transcription factors act- 
ing independently and thus the sum of the functions is a 
good approximation. 

Here we are going to consider a gene network with 
activation-repression interaction between the genes for 
general a. Accordingly the chain equation is given by, 



dpn _ a + PpZ+i P + aPn-i 
dt ~ 1+K+i 
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Here we have a strongly nonlinear differential-difference 
equation which gives the distribution in time of the tran- 
scription factors for all the genes In order to put it into 
a more tractable form we are going to use the following 
substitution (valid since p n (t) is always positive) 



Pn = e 2 ^° 



and consider that f3 = a+L where L is a positive number. 
Then the equation will take the form: 
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Using the fact that 
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then after time rescaling with L one gets: 
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e " J = (tanh0„+i - tanh0„_i) + (2a - a + L)/L - yy(n)/L)t 



Since a and 7(71) are small the above equation can be 
seen as being: 

^e~^"^ = tanh0„ + i — tanh0„_i + perturbation 

Now in order to see the structure of solutions we are 
going to analyse the above equation without perturbation 
for different cr's. For a = 1 putting 

4>n = r log(— - 1), < u n (t) < 1 
2 u n 

one gets the celebrated integrable differential-difference 
Koretweg de Vries (KdV) equation 

u n = ?4(u„+i - u n -i) 

It is known that it admits multisoliton solution for arbi- 
trary number of solitons and any value of wave parame- 
ters. Here we have to analyse if this type of solution ex- 
ists in the interval (0, 1). The simplest way of computing 
soliton solution is by means of Hirota bilinear formalism 
ficj |. Considering the solution u n = F n+ iF n -i/F% the 
equation is turned into a bilinear one 

~D t F n • F n+ i — F n -iF n+ 2 + F n F n+ i = 

where the first term is written using the Hirota bilinear 
operator (D t a • b = ab — ab). The 1 soliton solution is 
given by F n = 1 + exp(fcn + Lot) with lo = 2 sinh k and 
k is the free wave parameter. Immediately one can see 
that u n = F n +iF n -i/ F 2 > 1 for every k so the soliton is 
not in the interval. If we try a dark type soliton (a hole 
shape instead of a pulse shape having value wq for n very 
big) namely u n = wq — G n /F n the the equation is turned 
into a more complicated bilinear one, 

D*G„ • F n — G n +iF n -i — G„_iF„ + i 

Fn+iF^t = F 2 - (2/w )G„F„ + {l/wl)G 2 n 

The 1-soliton solution is G n — exp(fcn + u)t),F n 
1 + bcxp(kn + u>t) + cexp(2kn + 2tot) where lo — 2 sinh k, 
b = -2e k /w (e k - l) 2 and c = e 2k /wl{e 2k - l) 2 + 
Ae 3k /w 2 ,(e k + l) 2 (e k — I) 4 Here, because F n at the denomi- 
nator have negative terms it is possible that solution may 
explode in finite time. This means that at some gene the 
protein production increase suddenly to the saturation. 
The values for k such that u n be in (0,1) are exactly in 
this exploding region. Accordingly, for a = I there is a 
dark soliton- like proteomic wave which will "fire" a spe- 
cific gene due to the singular character and make it work 
at saturation. 



In the case a >> 1 making an expansion in the left 
exponential and rescaling time we obtain: 




= tanh(/)„ + i — tanh0„_i 



and with the substitution </>„ = tanh w n with — I < 
w n < I 

w n = (1 - w 2 n )(w n+ i - W n -i) 

which is the defocusing differential-discrete modified Ko- 
rteweg de Vries (mKdV) equation. This equation which 
has been analysed in detail in [ll| admits only dark-type 
soliton as localised solutions. 

Assuming that the positive nonzero boundary is < 
wo < 1 the I dark soliton has the following rather com- 
plicated form: 

G„+iF„_i(l + w ) - G n -iF n+1 (l - w ) 
G n+1 F n _i(l + w ) + G n - 1 F n+1 (l - w ) 

where 

G n (t) = 1 + ae kn -^, F n {t) = 1 + e kn ~"\ 

and lo = 2(1 — w 2 ) sinh k, 

_ (1 + Wo) exp(-fc) + (1 - w ) exp(fc) - 2 
(1 + w ) exp(fc) + (I - wo) cxp(-fc) - 2 

which gets wider and wider as k increase. They are in 
the domain (—1, 1) and the exploding region is avoided 
provided |fc| < log((l + Wo)/(l — wq)) . Moreover the in- 
tegrable character of the equation exhibits general dark 
multisoliton solutions which at the interaction flip po- 
larity [ll| Multisoliton solution are constructed as usual 
bilinear Hirota formalism does. Biologically, these soli- 
tons can be seen as a propagation of proteomic signal for 
some genes in off or lower " gene expression" state in a 
background of proteins for genes in on or higher "gene 
expression" state. Of course the on- off picture is ap- 
proximative since the level of gene expression is given by 
the soliton amplitude (depth in the dark case) in vari- 
ous points be more complicated. The interesting fact is 
that the bigger is the amplitude (i.e. the closer is k to 
log((l + wo)/l — wo))) the wider is the hole in the soliton 
shape. Moreover when two different amplitude solitons 
collide the deeper one swallows the other and during the 
interaction the last one is turned into a bright one (re- 
versed polarity), namely in the off region appear some 
genes in the state on. This picture occurs for an arbi- 
trary number of dark solitons in interaction. Moreover k 
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is completely free so is the amplitude i.e. gene expression. 
The presence of perturbation will vary the amplitude and 
wave numbers in a decreasing way. 

Biologically , these solitons are propagating in "space 
of genes" since n is the index of the gene and it can 
be in any position subjected only to the interaction with 
specific transcription factors generated by the genes n + 1 
and n — 1. In addition they represent the transcription 
factors distribution in time on the genes. In bacteria 
usually a is not very big, so apparently the condition for 
obtaining the above discrete mKdV is missing. In any 
case had we started with the following promoter-activity 
function 

9a(p) = a + L tanh(p) gn(p) — a + L — L tanh(p) 

(having a sigmoidal shape as well) we would have ob- 
tained 



dp n 
dt 



tanhp„ + i — tanhp„_i + (2a — fi+L)/ L+j(n)/ Lp n 



which can be transformed immediately in the perturbed 
defocusing mKdV and with the same substitution but 
confined into a smaller domain < w n < 1. This in 
turn decrease the domain for k in the soliton solution. 
Accordingly, the distribution of transcription factors is 
given by the solution of a perturbed nonlinear scmidis- 
crete evolution equation which for one operator site is 
KdV (in the nonsolitonic sector) and for multi-operator 
sites is the defocusing mKdV (in the dark-soliton sector) . 

Now many important things arise from this model. 
First of all the parameters involved in our equations are 
tunable (a, (3, v) 0. As an outcome one can in principle 
control de genomic signal propagation. Moreover the in- 
tegrable character of the equation shows that the results 
are valid for a large domains of wave numbers (k) in the 
solitonic solutions. So, in principle one can start with 
many possible initial conditions and to expect a nice dy- 
namics. Of course from the experimental point of view 
this is extremely difficult but this is exactly what one 
wants from a model - to exhibit a rather stable dynamics 
and to have trains of proteomic signals from various lo- 
calised initial conditions. Even though nowadays usually 
the experimentators work with a few number of genes 
it is to be expected that in the future will be possible 
to construct big modules with big number of genes in 
interaction. Our model will serve as a possible robust 
transmission network similar to the solitonic fiber opti- 
cal devices. 



Of course many open questions remain. First of all 
we did not consider the stochastic effects in the produc- 
tion of mRNA which is extremely important. The proper 
execution of the genetic program relies on faithful signal 
propagation from one gene to the next. This process may 
be hindered by stochastic fluctuations arising from gene 
expression, because some of the components in the net- 
work may be present at low numbers, which makes fluctu- 
ations in concentrations unavoidable 121 ]. However, how 



expression fluctuations propagate from one gene to the 
next is largely unknown. But at least a model should in- 
clude the stochastic term in the equation. Another aspect 
which we neglected is the fact that equation for mRNA 
production is in fact a differential-delay since in the pro- 
moter activity function everything should be taken at 
delayed time. This delay is a very important ingredient 
in bistability or oscillation analysis and of course should 
be included in more elaborate model. 
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